TREC 2016 Total Recall Track
Abstract
The e-Discovery Team participated in the 2016 TREC Total Recall Track, Athome division, where thirty-four prejudged topics were considered using 290,099 emails of former Florida Governor Jeb Bush. The Team participated in TREC 2016 primarily to test the effectiveness of the standard search methodology it uses commercially to search for relevant evidence in legal proceedings: Predictive Coding 4.0 Hybrid Multimodal IST. The Team’s method uses a hybrid approach to continuous active learning that combines manual searches with document-ranking searches based on active machine learning. This is a systematic process in which skilled searchers apply a variety of search functions. The Team calls this type of search multimodal because all types of search methods are used. A single expert reviewer was used in each topic, along with Kroll Ontrack’s search and review software, eDiscovery.com Review (EDR). The Team classified 9,863,366 documents as either relevant or irrelevant across all 34 review projects. A total of 34,723 documents were correctly classified as relevant, as per the Team’s judgment and corrected standard. The 34,723 relevant documents were found by manual review of 6,957 documents, taking a total of 234.25 man-hours. This represents an average project time of 6.89 hours per topic. The Team thus reviewed and classified documents at an average speed of 42,106 files per hour. The Team attained an average Recall score of 88% across all 34 topics using the corrected standard. The Team also attained F1 scores of greater than 90% in twelve topics, including two perfect F1 scores of 100%.
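The aggregate figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the quoted numbers, not the Team's own tooling. The variable names and the `f1` helper are illustrative assumptions, and `f1` uses the standard harmonic-mean definition of the F1 score.

```python
# Figures quoted in the abstract (names are illustrative, not from the paper).
TOPICS = 34
COLLECTION_SIZE = 290_099      # emails in the Jeb Bush collection
TOTAL_HOURS = 234.25           # total review time across all topics
MANUALLY_REVIEWED = 6_957      # documents read by the human reviewer

# Every document in the collection is classified once per topic.
classified = TOPICS * COLLECTION_SIZE

# Average project time and overall classification speed.
hours_per_topic = TOTAL_HOURS / TOPICS
files_per_hour = classified / TOTAL_HOURS

# Manual reading rate, derived here for comparison only.
manual_docs_per_hour = MANUALLY_REVIEWED / TOTAL_HOURS


def f1(precision: float, recall: float) -> float:
    """Standard F1 score: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


print(f"documents classified: {classified:,}")
print(f"hours per topic:      {hours_per_topic:.2f}")
print(f"files per hour:       {files_per_hour:,.0f}")
print(f"manual docs per hour: {manual_docs_per_hour:.1f}")
print(f"F1 at P=1.0, R=1.0:   {f1(1.0, 1.0):.2f}")
```

The classification speed counts every document classified in a project, including those ranked by the software and never opened, which is why it is far higher than the manual reading rate computed from the same figures.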
Similar resources
San Francisco State University (SFSU) at Total Recall Track of TREC 2016
This paper describes the participation of the San Francisco State University group in the Text Retrieval Conference (TREC) 2016 Total Recall Track from the National Institute of Standards and Technology (NIST). The TREC series provides large test collections and judgements for participants to design Information Retrieval (IR) systems for different purposes. The purpose of Total Recall Track is seeking text se...
The University of Padua (IMS) at TREC 2016 Total Recall Track
The participation of the Information Management System (IMS) Group of the University of Padua in the Total Recall track at TREC 2016 consisted of a set of fully automated experiments based on the two-dimensional probabilistic model. We trained the model in two ways that tried to mimic a real user, and we compared it to two versions of the BM25 model with different parameter settings. This initi...
Webis at TREC 2016: Tasks, Total Recall, and Open Search Tracks
We give a brief overview of the Webis group’s participation in the TREC 2016 Tasks, Total Recall, and Open Search tracks. Our submissions to the Tasks track are similar to last year’s system. In the task understanding subtask of the Tasks track, we use different data sources (ClueWeb12 anchor texts, AOL query log, Wikidata, etc.) and APIs (Google, Bing, etc.) to retrieve suggestions related...
TREC 2016 Total Recall Track Overview
The primary purpose of the Total Recall Track is to evaluate, through controlled simulation, methods designed to achieve very high recall – as close as practicable to 100% – with a human assessor in the loop. Motivating applications include, among others, electronic discovery in legal proceedings [3], systematic review in evidence-based medicine [6], and the creation of fully labeled test collec...
WHU at TREC Total Recall Track 2015
This paper describes the WHU IRLAB participation in the Total Recall Track in TREC 2015. We implement an end-to-end system to deal with the total recall task. We propose an iterative query expansion method, which constructs queries using iteratively selected terms. We choose to participate in the "Play-at-home" evaluation. Results are presented and discussed.
WaterlooClarke: TREC 2015 Total Recall Track
The Total Recall track in TREC 2015 seeks an enhanced model to accelerate the autonomous technology-assisted review process. This paper introduces several novel ideas, such as a clustering-based seed selection method, extended n-gram features, and continuous query expansion learned from the relevant documents derived from each iteration. These methods can retrieve more relevant documents from each...
Publication date: 2016